However, the original method (CSF-Seg) has several problems, including low usability of the extracted fragments, position errors in the match results, and high algorithmic complexity. This thesis proposes a novel fragment extraction method that improves the usability of the extracted fragments. It also proposes a novel matching method called HER, a hybrid shape-matching method that combines the object's edge and region features; experimental results demonstrate its advantages. Finally, a prototype top-down image object segmentation system named IO-Seg is designed and implemented; it is based on class-specific fragments and integrates the methods proposed in this thesis.
Within this framework, following the SEMMA (Sample-Explore-Modify-Model-Assess) methodology from SAS, a customer segmentation system based on a data warehouse was designed and implemented, with the k-m-d algorithm at its core. The system consists of a data collection and preprocessing module, a clustering analysis module, a model application module, and a system administration module. It runs as a subsystem of the business analysis system, on that system's hardware platform.
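A Chinese information processing model is added to the traditional search engine, which can make the search engine intelligent and personalized. Automatic Chinese word segmentation is the first task in Chinese information processing. This paper studies a Chinese word segmentation system suited to intelligent search engines. To address the division of ambiguous fields, three principles for dividing them are proposed; on the basis of these principles, a segmentation scheme called the "two-character continued segmentation method" is given, which can decompose most ambiguous fields quickly and effectively and has high practical value.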
In the last part of this paper, we introduce the general-purpose word segmentation system for modern Chinese (GPWS) and analyze the set of criteria for evaluating a general-purpose segmentation system: comprehensiveness, extensibility, adaptiveness, and interactiveness, in addition to precision. The paper also briefly analyzes the difficulties facing general-purpose segmentation systems, describes how GPWS addresses them, and proposes the concept of an interactive segmentation system together with a simple interactive method.
Disambiguation and recognition of unknown words are the most important points in the design of word segmentation systems. This paper first describes the disambiguation and proper noun recognition techniques used in the general-purpose word segmentation system for modern Chinese (GPWS). For disambiguation, it introduces a practical strategy that combines a segmentation rule base with dynamic correction based on an ambiguity knowledge base. For proper nouns, it puts forward an integrated, fast recognition strategy covering Chinese person names, Chinese place names, translated foreign names, and corporation and organization names, which successfully resolves the conflicts between these proper nouns and ordinary words.
An OCR (optical character recognition) system has wide-ranging uses in areas such as automatic text input and digital libraries. It is composed of two primary parts: online recognition and offline recognition. The page segmentation system studied in this paper is a crucial part of offline recognition. As a fast, labor-saving means of automatic text input, OCR can greatly reduce the workload of data entry and increase its speed, and it can be widely applied to electronic publishing, Internet resource databases, and the construction of digital libraries.
This dissertation discusses image segmentation technology based on image visual features and image neighborhood moment analysis, together with the related image detection and segmentation system. The key functional modules analyzed are as follows: image neighborhood moment analysis, adaptive clustering after the visual moment transformation, boundary detection of the object image, and extraction of object image areas of similar visual brightness.
Meanwhile, on the application side, online Chinese word segmentation systems provide only a test function and have defects such as handling only small amounts of text, inconvenient usage, and no interface that programs can call directly. The grid is a novel technology that has emerged in recent years following the Internet and the WWW, and it can offer a distributed parallel environment for complex applications.
We attempt to exploit various machine-learning techniques to learn heuristic knowledge from users' segmentations and evaluations of a set of training images, so that the image segmentation system gains some human-like ability to adaptively select the optimal algorithm and corresponding parameters according to each image's features, and thus achieves satisfactory segmentation automatically. Learning-based image segmentation systems can be classified i
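The idea of choosing segmentation parameters from user-rated training images can be illustrated with a minimal sketch. This is not the system described in the source: the nearest-neighbor lookup, the two intensity features, the function names, and the training values below are all assumptions made purely for illustration.

```python
from math import dist
from statistics import mean, pstdev


def features(image):
    """Describe a grayscale image (rows of 0-255 ints) by mean and spread."""
    pixels = [p for row in image for p in row]
    return (mean(pixels), pstdev(pixels))


def select_threshold(image, rated_examples):
    """Return the threshold of the training example nearest in feature space.

    Nearest-neighbor lookup is an assumed stand-in for the learning step.
    """
    f = features(image)
    _, threshold = min(rated_examples, key=lambda ex: dist(f, ex[0]))
    return threshold


def segment(image, threshold):
    """Binary segmentation: 1 where the pixel is at or above the threshold."""
    return [[1 if p >= threshold else 0 for p in row] for row in image]


# Hypothetical training data: (image features, threshold the user rated best).
rated_examples = [
    ((60.0, 40.0), 70),    # darker images
    ((180.0, 40.0), 170),  # brighter images
]

dark = [[20, 30, 120], [25, 110, 130]]
bright = [[200, 210], [150, 220]]

dark_mask = segment(dark, select_threshold(dark, rated_examples))
bright_mask = segment(bright, select_threshold(bright, rated_examples))
```

Here each new image inherits the threshold that a user preferred for the most similar training image, so darker and brighter images are segmented with different parameters. A real system would likely use richer features and a trained model rather than a two-entry lookup.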